AITopics | graph feedback

graph feedback

Stochastic contextual bandits with graph feedback: from independence number to MAS number

Neural Information Processing Systems, Mar-21-2026, 04:40:46 GMT

We consider contextual bandits with graph feedback, a class of interactive learning problems with richer structures than vanilla contextual bandits, where taking an action reveals the rewards for all neighboring actions in the feedback graph under all contexts. Unlike the multi-armed bandits setting where a growing literature has painted a near-complete understanding of graph feedback, much remains unexplored in the contextual bandits counterpart. In this paper, we make inroads into this inquiry by establishing a regret lower bound $\Omega(\sqrt{\beta_M(G) T})$, where $M$ is the number of contexts, $G$ is the feedback graph, and $\beta_M(G)$ is our proposed graph-theoretic quantity that characterizes the fundamental learning limit for this class of problems. Interestingly, $\beta_M(G)$ interpolates between $\alpha(G)$ (the independence number of the graph) and $\mathsf{m}(G)$ (the maximum acyclic subgraph (MAS) number of the graph) as the number of contexts $M$ varies. We also provide algorithms that achieve near-optimal regret for important classes of context sequences and/or feedback graphs, such as transitively closed graphs that find applications in auctions and inventory control. In particular, with many contexts, our results show that the MAS number essentially characterizes the statistical complexity for contextual bandits, as opposed to the independence number in multi-armed bandits.
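
The two graph quantities named in the abstract above can be compared on toy graphs with a brute-force sketch. This is an illustration only, not code from the paper: the function names are hypothetical, self-loops are assumed to be ignored, the independence number is taken as the largest vertex set with no edge in either direction between distinct members, and the MAS number is taken as the largest vertex set whose induced subgraph has no directed cycle. The search is exponential in the number of vertices, so it is only suitable for very small graphs.

```python
from itertools import combinations


def is_independent(nodes, edges):
    """True if no edge (in either direction) joins two distinct nodes in `nodes`."""
    s = set(nodes)
    return not any(u in s and v in s and u != v for u, v in edges)


def is_acyclic(nodes, edges):
    """True if the subgraph induced by `nodes` has no directed cycle.

    Self-loops are ignored (an assumption of this sketch).
    """
    s = set(nodes)
    succ = {u: [] for u in s}
    indeg = {u: 0 for u in s}
    for u, v in edges:
        if u in s and v in s and u != v:
            succ[u].append(v)
            indeg[v] += 1
    # Kahn's algorithm: repeatedly strip vertices with no incoming edge.
    stack = [u for u in s if indeg[u] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == len(s)


def largest_subset(vertices, edges, predicate):
    """Size of the largest vertex subset satisfying `predicate` (brute force)."""
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            if predicate(subset, edges):
                return k
    return 0


def independence_number(vertices, edges):
    return largest_subset(vertices, edges, is_independent)


def mas_number(vertices, edges):
    return largest_subset(vertices, edges, is_acyclic)


# Transitively closed "chain" on 4 actions: i -> j whenever i < j.
V = [0, 1, 2, 3]
chain = [(i, j) for i in V for j in V if i < j]
print(independence_number(V, chain), mas_number(V, chain))  # 1 4
```

On this transitively closed chain the two quantities are as far apart as they can be (1 versus 4), whereas on a graph whose edges all come in bidirectional pairs they coincide, since every edge is then a 2-cycle.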

artificial intelligence, machine learning, proceedings, (9 more...)

Genre: Research Report > New Finding (0.59)

Industry: Education (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Stochastic contextual bandits with graph feedback: from independence number to MAS number

Yuxiao Wen, Yanjun Han

Neural Information Processing Systems, Feb-15-2026, 22:37:24 GMT

The framework of formulating the feedback structure as feedback graphs in bandits has a long history (Mannor and Shamir, 2011; Alon et al., 2015, 2017; Lykouris et al.,

bandit, data mining, machine learning, (18 more...)

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science > Data Mining (0.68)

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio

Julian Zimmert, Tor Lattimore

Neural Information Processing Systems, Feb-12-2026, 23:11:37 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, bandit, osmd, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Understanding Bandits with Graph Feedback

Neural Information Processing Systems, Dec-24-2025, 22:33:30 GMT

The bandit problem with graph feedback, proposed in [Mannor and Shamir, NeurIPS 2011], is modeled by a directed graph $G=(V,E)$ where $V$ is the collection of bandit arms, and once an arm is triggered, all its incident arms are observed. A fundamental question is how the structure of the graph affects the min-max regret. We propose the notions of the fractional weak domination number $\delta^*$ and the $k$-packing independence number capturing upper bound and lower bound for the regret respectively. We show that the two notions are inherently connected via aligning them with the linear program of the weakly dominating set and its dual --- the fractional vertex packing set respectively.
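
The feedback model described above can be made concrete with a small simulation sketch. It is a hedged illustration rather than anything from the paper: it assumes that "incident arms" means the out-neighbors of the pulled arm in $G$, that an arm reveals its own reward only if it has a self-loop, and that rewards are Bernoulli. The names `pull`, `out_neighbors`, and `means` are hypothetical.

```python
import random


def pull(arm, out_neighbors, means, rng):
    """Simulate one round: pulling `arm` reveals a reward for each out-neighbor.

    out_neighbors: dict mapping each arm to the arms it observes when pulled.
    means: Bernoulli mean reward of each arm (unknown to the learner).
    Returns {observed_arm: reward}.
    """
    return {j: 1 if rng.random() < means[j] else 0 for j in out_neighbors[arm]}


# Arm 2 observes every arm; arm 1 observes only itself.
out_neighbors = {0: [0, 1], 1: [1], 2: [0, 1, 2]}
means = [1.0, 0.0, 1.0]  # degenerate means keep the example deterministic
rng = random.Random(0)
print(pull(2, out_neighbors, means, rng))  # {0: 1, 1: 0, 2: 1}
print(pull(1, out_neighbors, means, rng))  # {1: 0}
```

Pulling a well-connected arm such as arm 2 yields information about every arm at once, which is why the graph structure, and not only the number of arms, governs the achievable regret.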

frac, graph feedback, name change, (5 more...)

Technology: Information Technology > Artificial Intelligence (0.42)

Stochastic contextual bandits with graph feedback: from independence number to MAS number

Yuxiao Wen, Yanjun Han

Neural Information Processing Systems, Oct-10-2025, 06:29:37 GMT

The framework of formulating the feedback structure as feedback graphs in bandits has a long history (Mannor and Shamir, 2011; Alon et al., 2015, 2017; Lykouris et al.,

algorithm, bandit, contextual bandit, (15 more...)

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science > Data Mining (0.68)

Understanding Bandits with Graph Feedback

Neural Information Processing Systems, Oct-9-2025, 16:19:46 GMT

We show that the two notions are inherently connected via aligning them with the linear program of the weakly dominating set and its dual -- the fractional vertex packing set respectively.

artificial intelligence, data mining, machine learning, (19 more...)

Country:

Asia > China > Shanghai > Shanghai (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.36)

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio

Julian Zimmert, Tor Lattimore

Neural Information Processing Systems, Oct-3-2025, 05:53:15 GMT

In most applications there is a tantalising similarity to the classical analysis based on mirror descent.

bandit, data mining, machine learning, (18 more...)

Country: Europe > France (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Stochastic contextual bandits with graph feedback: from independence number to MAS number

Neural Information Processing Systems, May-27-2025, 05:42:46 GMT

We consider contextual bandits with graph feedback, a class of interactive learning problems with richer structures than vanilla contextual bandits, where taking an action reveals the rewards for all neighboring actions in the feedback graph under all contexts. Unlike the multi-armed bandits setting where a growing literature has painted a near-complete understanding of graph feedback, much remains unexplored in the contextual bandits counterpart. In this paper, we make inroads into this inquiry by establishing a regret lower bound $\Omega(\sqrt{\beta_M(G) T})$, where $M$ is the number of contexts, $G$ is the feedback graph, and $\beta_M(G)$ is our proposed graph-theoretic quantity that characterizes the fundamental learning limit for this class of problems. Interestingly, $\beta_M(G)$ interpolates between $\alpha(G)$ (the independence number of the graph) and $\mathsf{m}(G)$ (the maximum acyclic subgraph (MAS) number of the graph) as the number of contexts $M$ varies. We also provide algorithms that achieve near-optimal regret for important classes of context sequences and/or feedback graphs, such as transitively closed graphs that find applications in auctions and inventory control.

artificial intelligence, contextual bandit, machine learning, (6 more...)

Industry: Education (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

Understanding Bandits with Graph Feedback

Neural Information Processing Systems, Jan-19-2025, 06:03:40 GMT

The bandit problem with graph feedback, proposed in [Mannor and Shamir, NeurIPS 2011], is modeled by a directed graph $G=(V,E)$ where $V$ is the collection of bandit arms, and once an arm is triggered, all its incident arms are observed. A fundamental question is how the structure of the graph affects the min-max regret. We propose the notions of the fractional weak domination number $\delta^*$ and the $k$-packing independence number capturing upper bound and lower bound for the regret respectively. We show that the two notions are inherently connected via aligning them with the linear program of the weakly dominating set and its dual --- the fractional vertex packing set respectively. Therefore, our bounds are tight up to a $\left(\log |V|\right)^{\frac{1}{3}}$ factor on graphs with bounded integrality gap for the vertex packing problem including trees and graphs with bounded degree.

frac, graph feedback, integrality gap, (3 more...)

Technology: Information Technology > Artificial Intelligence (0.45)

Graph Feedback Bandits with Similar Arms

Qi, Han, Fei, Guo, Zhu, Li

arXiv.org Artificial Intelligence, May-18-2024

In this paper, we study the stochastic multi-armed bandit problem with graph feedback. Motivated by clinical trials and recommendation problems, we assume that two arms are connected if and only if they are similar (i.e., their means are close enough). We establish a regret lower bound for this novel feedback structure and introduce two UCB-based algorithms: D-UCB with problem-independent regret upper bounds and C-UCB with problem-dependent upper bounds. Leveraging the similarity structure, we also consider the scenario where the number of arms increases over time. Practical applications related to this scenario include Q&A platforms (Reddit, Stack Overflow, Quora) and product reviews on Amazon and Flipkart. Answers (product reviews) continually appear on the website, and the goal is to display the best answers (product reviews) at the top. When the means of arms are independently generated from some distribution, we provide regret upper bounds for both algorithms and discuss the sub-linearity of the bounds in relation to the distribution of means. Finally, we conduct experiments to validate the theoretical results.
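
The similarity-based feedback structure in the abstract above can be sketched in a few lines. This is neither D-UCB nor C-UCB from the paper: it is a generic UCB1-style index that simply also updates the statistics of every neighbor of the pulled arm. The threshold `eps`, the Bernoulli reward model, and all function names are assumptions made for illustration.

```python
import math
import random


def similarity_graph(means, eps):
    """Connect arms i and j if and only if |mean_i - mean_j| <= eps.

    Every arm is its own neighbor, since the difference is then zero.
    """
    n = len(means)
    return {i: [j for j in range(n) if abs(means[i] - means[j]) <= eps]
            for i in range(n)}


def ucb_with_side_observations(means, neighbors, horizon, seed=0):
    """Generic UCB1-style loop in which a pull also reveals neighbors' rewards.

    The learner uses only `neighbors`; `means` is used solely to simulate
    Bernoulli rewards. Returns (pulls per arm, observations per arm).
    """
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n   # number of observations of each arm
    sums = [0.0] * n   # total observed reward of each arm
    pulls = [0] * n    # number of times each arm was actually played

    for t in range(1, horizon + 1):
        def index(i):
            if counts[i] == 0:
                return float("inf")  # force at least one observation
            bonus = math.sqrt(2.0 * math.log(t) / counts[i])
            return sums[i] / counts[i] + bonus

        arm = max(range(n), key=index)
        pulls[arm] += 1
        # Side observations: every neighbor of the played arm is updated.
        for j in neighbors[arm]:
            reward = 1.0 if rng.random() < means[j] else 0.0
            counts[j] += 1
            sums[j] += reward
    return pulls, counts


means = [0.1, 0.15, 0.9]
graph = similarity_graph(means, eps=0.1)
print(graph)  # {0: [0, 1], 1: [0, 1], 2: [2]}
pulls, counts = ucb_with_side_observations(means, graph, horizon=200)
```

In this toy instance the two low-mean arms share observations, so each of them accumulates at least as many observations as pulls without extra exploration cost.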

algorithm, graph, optimal arm, (16 more...)

2405.11171

Country: Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
